Fault Tolerance Grid Scheduling with Checkpoint Based on Ant Colony System
Authors
Corresponding author: Saufi Bukhari, School of Computing, Universiti Utara Malaysia, Malaysia. Email: [email protected]
Abstract
Task resubmission and checkpointing are among the popular techniques for providing fault tolerance in grid computing. However, because side-by-side comparisons are lacking, it is unclear which technique provides fault tolerance without degrading system performance. This study proposes Dynamic ACS-based Fault Tolerance for grid computing, which combines resubmission to a new resource, checkpointing, and the use of resource execution history, with the aim of reducing execution time and task processing time and increasing the success rate in a grid environment. The proposed algorithm is compared with other relevant algorithms in terms of execution time, success rate and average processing time. The results suggest that the proposed algorithm, with improved task resubmission, checkpointing and an extended pheromone update formula, performs better in managing execution failures and in selecting resources during task assignment or resubmission.
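The abstract names three mechanisms (checkpointing, resubmission to a new resource, and a pheromone update informed by execution history) but this page does not give the paper's algorithm or its extended pheromone formula. The following is only a minimal Python sketch of how such pieces could fit together, under stated assumptions: the `Resource` class, the scripted `failure_plan`, the greedy choice by `pheromone * speed`, and the update rule `tau = (1 - rho) * tau + rho * success_ratio` are all hypothetical illustrations, not the authors' method.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Resource:
    name: str
    speed: float  # work units processed per time unit
    # Hypothetical failure script: on each attempt the resource fails after
    # this many work units; once the list is empty it never fails.
    failure_plan: List[float] = field(default_factory=list)
    pheromone: float = 1.0
    successes: int = 0
    failures: int = 0


def update_pheromone(res: Resource, rho: float) -> None:
    """Assumed ACS-style update: evaporate, then deposit the success ratio
    taken from the resource's execution history."""
    runs = res.successes + res.failures
    reward = res.successes / runs if runs else 0.0
    res.pheromone = (1 - rho) * res.pheromone + rho * reward


def run_task(work, resources, checkpoint_interval=None, rho=0.1, max_attempts=100):
    """Run one task, resubmitting to a different resource after each failure.

    Returns (elapsed_time, history). With a checkpoint interval, progress up
    to the last completed checkpoint survives a failure; without one, the
    task restarts from zero.
    """
    done = 0.0        # work preserved by checkpoints
    elapsed = 0.0
    excluded = None   # resource that failed most recently
    history = []
    for _ in range(max_attempts):
        # Avoid the resource that just failed, unless it is the only one.
        candidates = [r for r in resources if r is not excluded] or resources
        # Greedy choice: pheromone (learned reliability) times speed (heuristic).
        res = max(candidates, key=lambda r: r.pheromone * r.speed)
        remaining = work - done
        fail_after = res.failure_plan.pop(0) if res.failure_plan else None
        if fail_after is None or fail_after >= remaining:
            elapsed += remaining / res.speed
            done = work
            res.successes += 1
            update_pheromone(res, rho)
            history.append((res.name, "completed", done))
            return elapsed, history
        # Failure: time is spent, but only checkpointed work is kept.
        elapsed += fail_after / res.speed
        if checkpoint_interval:
            done += (fail_after // checkpoint_interval) * checkpoint_interval
        res.failures += 1
        update_pheromone(res, rho)
        history.append((res.name, "failed", done))
        excluded = res
    raise RuntimeError("task did not finish within max_attempts")


def make_grid():
    # Two hypothetical resources: A is fast but fails once, B is slow but stable.
    return [Resource("A", speed=2.0, failure_plan=[5.0]), Resource("B", speed=1.0)]


with_checkpoint, _ = run_task(10.0, make_grid(), checkpoint_interval=2.0)
without_checkpoint, _ = run_task(10.0, make_grid())
print(with_checkpoint, without_checkpoint)
```

In this toy setup the failed resource loses pheromone and is skipped on resubmission, and the checkpointed run finishes sooner because only the work done since the last checkpoint is repeated. Real grid schedulers would also need to account for checkpoint overhead, which this sketch ignores.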
Similar resources
Stability Assessment Metamorphic Approach (SAMA) for Effective Scheduling based on Fault Tolerance in Computational Grid
Grid computing allows coordinated and controlled resource sharing and problem solving in multi-institutional, dynamic virtual organizations. Moreover, fault tolerance and task scheduling are important issues for large-scale computational grids because of the unreliable nature of grid resources. A commonly exploited technique to realize fault tolerance is periodic checkpointing, which periodically ...
An improved ant colony optimization algorithm with fault tolerance for job scheduling in grid computing systems
The Grid scheduler schedules user jobs on the best available resource in terms of resource characteristics by optimizing job execution time. Resource failure in the Grid is no longer an exception but a regularly occurring event, as resources are increasingly being used by the scientific community to solve computationally intensive problems which typically run for days or even months. It is therefore ...
Fault Tolerant ACO using Checkpoint in Grid Computing
This paper proposes an algorithm for fault-tolerant distributed computation in a grid by means of the meta-heuristic Ant Colony Optimization (ACO) technique and checkpointing. Load balancing is the process of distributing the workload among the nodes in a grid. The load can be CPU cycles, memory capacity or network load. Due to the emerging computing methodology of grid computing over the hete...
An Improved min - min Algorithm for Job Scheduling using Ant Colony Optimization
Grid computing is recognized as one of the most powerful vehicles for high-performance computing for data-intensive scientific applications. The grid is an alternative to traditional distributed computing. It addresses issues such as resource discovery, heterogeneity, fault tolerance and task scheduling. Scheduling is one of the current issues in the complex heterogeneous environment. Job schedulin...
A Peer-to-Peer Meta-Scheduler for Service-Oriented Grid Environment
Meta-scheduling in a Grid is aimed at enabling the efficient sharing of computing resources managed by different local schedulers within a single organization or scattered across several administrative domains. Since current Grid metaschedulers operate in a centralized fashion and thus are single points of failure, we present a distributed meta-scheduler for a service-oriented Grid environment ...
Journal title: JCS
Volume 13
Publication date: 2017